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Abstract 

Background: Systematic review (SR) of randomized controlled trials (RCT) is the gold standard for informing treatment 
choice. Decision analyses (DA) also play an important role in informing health care decisions. It is unknown how often 
the results of DA and matching SR of RCTs are in concordance. We assessed whether the results of DA are in 
concordance with SR of RCTs matched on patient population, intervention, control, and outcomes. 

Methods: We searched PubMed up to 2008 for DAs comparing at least two interventions followed by matching SRs of 
RCTs. Data were extracted on patient population, intervention, control, and outcomes from DAs and matching SRs of 
RCTs. Data extraction from DAs was done by one reviewer and from SR of RCTs by two independent reviewers. 

Results: We identified 28 DAs representing 37 comparisons for which we found matching SR of RCTs. Results of the 
DAs and SRs of RCTs were in concordance in 73% (27/37) of cases. The sensitivity analyses conducted in either DA or 
SR of RCTs did not impact the concordance. Use of single (4/37) versus multiple data source (33/37) in design of DA 
model was statistically significantly associated with concordance between DA and SR of RCTs. 

Conclusions: Our findings illustrate the high concordance of current DA models compared with SR of RCTs. It is 

shown previously that there is 50% concordance between DA and matching single RCT. Our study showing the 
concordance of 73% between DA and matching SR of RCTs underlines the importance of totality of evidence 
(i.e. SR of RCTs) in the design of DA models and in general medical decision-making. 



Background 

Medical decision-making requires a comprehensive analysis 
of benefits and harms associated with available treatment 
options. Randomized controlled trials (RCTs), and in turn 
systematic reviews (SRs) of RCTs, are considered the refer- 
ence standard in resolving treatment uncertainties [1,2]. 
However, in many instances RCTs and in turn SR of RCTs 
lack the power and long duration of follow up needed to 
assess the long-term outcome estimates [3]. Decision ana- 
lysis (DA) can be used to provide the required estimates to 
inform medical decision-making [4]. 

Decision analysis models have been used in medical- 
decision making since 1972 [5-7]. For a DA model to be 
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useful and applicable it should reflect real problems of 
patients and data on clinical outcome probabilities 
should be generated using a systematic approach. While 
DAs can allow users to make informed decisions when 
confronted with difficult clinical scenarios, their over- 
simplification of real world scenarios can be problematic 
[8]. DA models are often based on data derived from 
empirical studies with short-term follow up and the 
biases from these studies influence modeled outcomes 
[3]. Although there are guidelines for assessing the use- 
fulness of DA and their role in medical decision-making 
[9,10] very few studies have assessed the soundness of 
DAs. That is, how often DA results agree with findings 
of matching RCTs or SR of RCTs (published after the DA) 
has not been comprehensively investigated. We are aware 
of two studies that have assessed the soundness of DA 
using subsequent clinical study results [11,12]. The study 
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by Bress et al. conducted in the field of infectious diseases 
determined that findings of DAs were in concordance with 
findings of clinical studies (including RCTs and observa- 
tional studies) in 75% of the cases assessed [11]. We have 
previously shown that findings of DAs and matching RCTs 
are in concordance only 50% of the time [12]. However, it 
is not known how often findings of DAs correspond with 
matching SRs of RCTs. Accordingly, the objective of this 
study is to assess how often findings of DAs are in concord- 
ance with matching SRs of RCTs. 

Methods 

Eligibility criteria 

Decision analyses comparing two or more treatments were 
eligible for inclusion in this study. Systematic reviews of 
RCTs published after the matching DA models were in- 
cluded. That is, SR of RCTs published before the matching 
DA was not included in this study. If a matching SR for an 
included DA model was not found, the DA was excluded 
for the study. 

Information sources 

We searched PubMed and Cochrane library for identifying 
DA and matching SR of RCTs. 

Search strategy: decision analysis papers 

"Decision Support Techniques" was introduced as medical 
subject heading term in year 2000. Hence, we searched 
PubMed (Medline) for DAs from 01/2000 to 12/2008 for 
DAs using the following search strategy: ("Decision Support 
Techniques" [Mesh] OR ("Decision Making" [Mesh]) 
OR ("Decision Analysis") AND ("Therapeutics" [Mesh] 
OR "therapy "[Subheading] OR "Treatment Outcome" 
[Mesh] OR "Therapies, Investigational"[Mesh])). 

Search strategy: systematic reviews of RCTs 

We searched PubMed and Cochrane library for systematic 
reviews (SRs) of RCTs that matched the identified DAs 
based on patient population (P), intervention (I), control (C) 
and outcome (O) (PICO) criteria. Clinical Queries search 
strategies in Pubmed which have been updated based on 
the filter developed by Haynes et al. were also utilized to 
search for SRs of RCTs matching the DAs [13]. Systematic 
reviews of RCTs published after the matching DA models 
were included. Keywords from DA intervention and control 
arms were used and, if necessary, search returns were nar- 
rowed by using keywords from the DA patient population. 

Study selection 

Abstracts of all the identified studies were reviewed by 
one reviewer (RM) for inclusion according to the pre- 
determined criteria. In addition, 2 reviewers (BD and AK) 
randomly selected and reviewed 15% of the citations for 
inclusion to assess for accuracy. Another set of reviewers 



(HG and HW) reviewed list of all citations to identify 
matching SRs of RCTs for the included DAs. The list of 
matched SR of RCTs and DA were further confirmed in- 
dependently by 2 reviewers (RM and AK). Any disagree- 
ments in the selection process of DA and matching SR 
of RCTs were resolved by consensus. 

Data extraction 

Data were extracted from each included DA and SR of 
RCTs using a standardized data extraction form. Data 
were extracted on PICO elements from all DAs and 
matching SRs of RCTs. From each DA we also extracted 
data on whether single versus multiple data sources were 
used the design of DA model. From each included SR of 
RCTs data were also extracted on the number of RCTs 
included, sample size and year of publication. Data abstrac- 
tion from included DAs was done by one reviewer (RM) 
and from SR of RCTs by two reviewers (HG and HW). 
Senior reviewers (BD and AK) randomly selected and 
reviewed 15% of the extracted data from included stud- 
ies to assess for accuracy. 

Outcomes 

Matching of DA and SR of RCTs 

Abstracts of identified SRs of RCTs were reviewed by 
two reviewers (HW and HG) independently to deter- 
mine the degree of matching based on PICO elements 
as follows: Overall the matching of DA and SR was done 
for all individual PICO elements at 3 levels classified either 
as optimum, broad or broadest match. First the match was 
done at participant/patient population level followed by 
intervention(s), control and outcome(s). If the initial match 
at participant/disease level was not achieved, the DA was 
excluded from the review. PICO elements of SR of RCTs 
were considered an optimum match to a DA if it involved 
same PICO elements. The PICO elements of SR of RCTs 
were considered a broad match to a DA if it involved simi- 
lar PICO elements. The PICO elements of SR of RCTs were 
considered a broadest match to a DA if it involved only 
slightly similar PICO elements. 

Examples of optimum, broad and broadest match are 
shown in Table 1. In situations where multiple matches 
were found, the most recently published SR of RCTs 
was chosen. 

Concordance and impact of sensitivity analysis 

It is well established that majority of the DAs conduct 
and report sensitivity analyses and the final outcomes of 
the DA model may be influenced by these sensitivity 
analyses. Hence, to assess the impact of the DA outcomes 
after these sensitivity analyses on the concordance or dis- 
cordance with the findings of matching SR of RCTs we ex- 
tracted data on the sensitivity analyses. Specifically, for 
each included DA and SR of RCTs, two review authors 
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Table 1 Examples of matching based on PICO elements 



PICO element 


Decision analysis 


Systematic review of RCTs 


Matching category 


Participant (P) 


"premenopausal women with newly diagnosed 
hormone responsive breast cancer" [14] 


"premenopausal women with early breast cancer 
which was responsive for estrogen receptor" [15] 


Optimum 




"women with breast cancer" [1 6] 


"women with early stage breast cancer" [1 7] 


Broad 




" high risk women seeking prophylactic 
mastectomy" [18] 


"women with invasive breast cancer" [19] 


Broadest 


ntervention (IJ/Control (C) 


"breast conserving surgery" (intervention) and 
"mastectomy" (control) [1 6] 


"breast conserving surgery" (intervention) and 
"mastectomy" (control) [20] 


Optimum 




"medical ovarian suppression or surgical 
ovarian suppression" (intervention). [14] 


"medical ovarian suppression" (intervention) [15] 


Broad 




"conservation therapy (medical and other methods)" 
(control) [21] 


"medical therapy" for patients with non-acute 
coronary artery disease (control) [22] 


Broad 




"surgery" (intervention) [23] 


"laser excision" in patients with glottic cancer 
(intervention) [20] 


Broadest 


Outcome (0) 


"overall survival" [14] 


"overall survival" [15] 


Optimum 




"complications" [18] 


"morbidity" [19] 


Broad 




"breast cancer mortality" [24] 


"all-cause mortality" [25] 


Broadest 



(HW and HG) independently extracted data on the authors 
overall conclusion (indicating which treatment was better); 
whether author's conclusion changed after sensitivity 
analysis and whether the conclusion (s) of DA and SR of 
RCTs agreed or disagreed. Discrepancies between the 
two reviewers' judgments were resolved by discussion 
and mutual consensus with other reviewer (RM). 

Statistical analysis 

We used descriptive statistics to report concordance or 
discordance between results of DA and SR of RCTs. Impact 
of variables that are used in the construction of DA model 
on concordance between DA and matching SR of RCTs 
was assessed by Fishers' exact test. The impact of sample 
size of SR of RCTs on concordance of findings between DA 
and SR of RCTs was tested using Kruskal-Wallis one-way 
analysis of variance test [26]. 

Results 

Study selection 

The PubMed search for DAs yielded 42,704 citations 
(Figure 1). We excluded 42,617 studies after reviewing 
abstracts and found 87 studies that used DA modeling 
to compare two or more interventions. We found 
matching SR of RCTs for 32% (28/87) of DAs. These 28 
DAs included 37 comparisons for which we found a 
matching SR of RCTs. 

Characteristics of included DAs and SRs 

Infection (11/37) and cancer (10/37) were the most fre- 
quentiy studied diseases using DA modeling. The included 
DAs investigated effects of medications in 56% (21/37) of 
cases compared with surgical interventions in 38% (14/37) 
of cases (Table 2). Ninety five percent (35/37) of DAs did 



not collect any primary data and used data published in the 
literature in designing DA model. Only 5% (2/37) of DA 
models used a systematic approach (e.g. meta-analysis) 
to data collection. Similarly, 5% (2/37) of DA models used 
expert opinion in designing the DA model. Ninety seven 
percent (36/37) of DAs conducted sensitivity analysis 
while 51% (19/37) of included SRs of RCTs conducted 
sensitivity or sub-group analysis. The median sample size 
of included SR of RCTs was 2610 (range: 42 to 32523). 

Matching between DA and SR of RCTs 

As summarized in Table 3 the match between DA par- 
ticipant characteristics with SR of RCTs was considered 
optimum in 57% (21/37), broad in 21% (8/37) and broadest 
in 18% (8/37) of cases, respectively. The match for inter- 
ventions studied in DAs with interventions in the SR of 
RCTs was optimum in 95% (35/37) and broad and broadest 
in 1/37 cases each. Similarly, the matching of controls in 
DA with the controls used in the SR of RCTs was optimum 
in 92% (34/37), broad in 5% (2/37) and broadest in 3% 
(1/37) of cases (Table 3). 

Concordance between findings of DA and SR of RCTs 

Overall, the findings of the DAs and the SRs of RCTs were 
in concordance in 73% (27/37) of cases. Twenty-seven per- 
cent (10/37) of the SR of RCTs findings were discordant 
with the findings of the DA (Figure 2). 

Out of the 21 pairs of DA and SR with the optimum 
match of the patient characteristics 66% (14/21) of the 
DA findings were in concordance with the findings of 
the matching SR of RCTs. Out of the 8 pairs of DA and 
SR with the broad match of the patient characteristics 
87% (7/8) of the DA findings were in concordance with 
the findings of the matching SR of RCTs. Out of the 8 
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Citations retrieved from 
PubMed 
(n=42,704) 



Decision analyses comparing 
two or more interventions 
(n=87) 



Decision analyses with 
matching 
systematic reviews of RCTs 
N=28 
(Comparisons=37) 



Excluded (n=42,617) 

Non-DA (n=38,644) 
Reviews (n=l,332) 
Cost-analysis (n=l,195) 
RCTs (n=650) 
Comments (n=556) 
Editorials (n=240) 



Decision analyses with NO 

matching 
Systematic reviews of RCTs 

(n = 59) 



Figure 1 Flow diagram depicting study selection process. 



pairs of DA and SR with the broadest match of the pa- 
tient characteristics 75% (6/8) of the DA findings were 
in concordance with the findings of the matching SR of 
RCTs. There was no association between degree of match- 
ing based on patient characteristics and concordance of 
findings of DAs and SR of RCTs (p = 0.52). Similarly, there 
was no association between degree of matching based on 
intervention characteristics (p = 0.21) and control charac- 
teristics (p = 0.17) and concordance of findings of DAs and 
SR of RCTs. 

The majority of the sensitivity analyses conducted in DA 
(33/37) did not impact the concordance between findings 
of DA and matching SR of RCTs. In three cases, the find- 
ings of DA and matching SR of RCTs were similar after the 
sensitivity analysis [14,16,27]. 

Impact of decision analysis design attributes on concordance 

The findings of DAs using multiple data sources were 
more likely to be concordant with matching SRs of 
RCTs than DAs using single data source and this associ- 
ation reached statistical significance (p = 0.05) (Table 4). 



Incorporation of data from meta-analysis (p = 1.00), use of 
expert opinion (p = 0.06) and primary versus secondary 
data collection (p = 0.47) in the design of DA model did 
not have any impact on concordance between findings of 
DA and SR of RCTs (Table 4). It is important to note that, 
the meta analysis used to inform the design of the DAs 
were obviously published before the DA model and were 
not used for matching the results of DAs and SRs. The 
distribution of sample size of SR of RCTs was similar 
across the SRs which were in concordance with the find- 
ings of their matched DAs compared with SRs with the 
findings discordant with their matched DAs (p = 0.78). 

Discussion 

Summary of evidence 

To our knowledge, this is the first SR to date comparing 
the results of DAs with matching SRs of RCTs. The find- 
ings show that there is high level of concordance between 
findings of DA and matching SRs of RCTs and use of mul- 
tiple sources of data in decision analyses appears to increase 
the predictive value of DA. 
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Table 2 Characteristics of decision analysis and 
systematic review of RCTs 



Characteristic 

DA Disease category 

nfectious diseases 
Cancer 

Cardiovascular disease 
Venous ailments 
OB/GYN 
Other 

Pressure ulcers 
Crohns disease 
Obesity 

Achilles Tendon Rupture 
Anti-phospholipid antibody syndrome 
DA intervention type 

Medication 
Surgery 

Abstaining from breast feeding 
Early weaning from breast feeding 
DA control type 

Medication 

Observation 

Surgery 

Breast feeding 

Radiation 

Placebo 

SR of RCTs characteristic 

Median sample size (range) 



N (%) 

1 1 (29.7) 
10 (27) 

5 (13.5) 
3 (8.1) 
2 (5.4) 

6 (16.2) 
2(2) 

1 (D 
1 (D 
1 (D 
1 (D 

21 (56) 
14 (38) 
1 (3) 

1 (3) 

1 8 (48.6) 
1 0 (27) 
5 (13.5) 

2 (5.4) 
1 (2.7) 
1 (2.7) 

2610 (42-32523) 



100 

90 

80 

f 70 

"5 60 
a 

s 50 

£ 40 



73% 
(27/37) 




27% 



(10/37) 



Concordance Disagreement 

Figure 2 Concordance between findings of decision analysis 
(DAs) and systematic review of RCTs. 



Comparative effectiveness research (CER) is gaining 
popularity and employs assessment of multiple interven- 
tions by comparing their long-term outcomes. However, 
in many instances the RCTs and other studies that are 
used in CER lack the power and long duration of follow 
up needed to assess the long-term outcome estimates 
[3]. Over past 39 years, DA has been applied to a variety 
of clinical problems to provide these much desired esti- 
mates to improve clinical decision making. However, the 
concordance between findings of empirical efficacy studies 
that are used for decision making (i.e. RCTs) and DA find- 
ings is not known. This largest SR to date shows concord- 
ant findings of DAs and matching SRs of RCTs in 73% of 
cases. The findings from our study also emphasizes on the 
importance of SR as we have previously shown that results 
of DAs and matching single RCT disagree about only 50% 
of the time [12]. 



Table 4 Impact of DA design attributes on concordance 
between findings of DA and SR of RCTs 



Table 3 Degree of matching between DA and SR of RCTs 

Category N (%) 



Patient 

Matching with 
Matching with 
Matching with 
Intervention 

Matching with 
Matching with 
Matching with 
Control 

Matching with 
Matching with 
Matching with 



optimum criteria 
broad criteria 
broadest criteria 

optimum criteria 
broad criteria 
broadest criteria 

optimum criteria 
broad criteria 
broadest criteria 



21 (56.8) 
8 (21.6) 
8 (21.6) 

35 (94.6) 
1 (2.7) 

1 (2.7) 

34 (91 .9) 

2 (5.4) 
1 (2.7) 



Category 



Concordance 

N (%) 



Disagreement 
N (%) 



Data source 

Single data source 

Multiple data source 

Data from 
meta-analysis used 

Yes 

No 

Expert opinion used 

Yes 
No 

Primary data 
collection undertaken 

Yes 

No 



1(25) 
26(79) 

2(100) 
25(71) 

0(0) 
27(77) 

1(50) 
26(74) 



3(75) 
7(21) 



0(0) 
10(29) 

2(67) 
8(23) 



1(50) 
9(26) 



P-value 



0.05 



1.0 



0.06 



0.47 
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Our findings are also in line with the other research 
study on the topic. The study by Bress et al. focused on 
infectious diseases and assessed the concordance of DA 
findings compared with subsequent clinical study results 
[11]. The study by Bress et al. determined that findings 
of DAs were in concordance with findings of clinical 
studies in 75% of the cases assessed [11]. However, the 
study by Bress et al. was limited to the field of infectious 
diseases and compared findings of DAs with either RCTs 
or observational studies [11]. Moreover, this study by 
Bress et al. did not comprehensively report the impact 
of DA design attributes and sample size of matching 
SRs on concordance between findings of DAs and SRs. 
We also explored the reasons for concordance and dis- 
cordance between findings of DAs and matching SRs 
of RCTs employing multiple analyses. Specifically, we 
investigated the impact of DA design factors and sam- 
ple size of matching SR of RCTs on concordance and 
discordance between findings of DA and SR of RCTs. 
Our results indicate that none of the attributes except 
use of single versus multiple data source in the design 
of DA models is significantly associated with concordance 
of findings between DA and matching SR of RCTs. Sample 
size of matching SR of RCTs did not have any impact on 
concordance and discordance between findings of DAs 
and SR of RCTs either. Another factor that may impact 
concordance between findings of DA and SR is the degree 
of matching between DA and SR PICO attributes which 
was performed in our study. In our study, the intervention 
and controls studied in DAs closely matched with SRs in 
majority of the cases. However, patient population en- 
rolled in DAs closely matched with SR in 57% of cases. 
Nonetheless, the degree of patient population matching 
did not have any impact on concordance between findings 
of DA and SR of RCTs. 

Limitations 

Our study has some limitations. There were a relatively 
small number of published DAs and an even smaller num- 
ber with matching SR of RCTs. However, since DAs are 
mostly conducted when a RCT is not available this was 
expected. Nonetheless, our findings are based on small 
sample size (n = 37) and hence should be interpreted with 
caution. We did not search for unpublished DAs or SR of 
RCTs. As noted by Bress et al, our literature search also 
could not distinguish DA from other study designs. If 
"decision analysis" were a MeSH term, such searches 
would be more efficient and reproducible [11]. As a result, 
we reviewed large volume of citations (n = 42,704) and 
hence our search is not updated since 2008. 

Conclusions 

Our results show the high concordance of findings of 
current DA models compared with findings of SR of 



RCTs. Moreover, our results outline the importance of SR 
of RCTs compared with a single RCT in medical decision 
making. That is, the concordance between DA findings 
and matching single RCT findings was only 50% [12] but 
the concordance between findings of DA and matching 
totality of evidence (i.e. SR of RCTs) was 73%. This under- 
scores the importance of use of research synthesis in med- 
ical decision making. 

Our study findings are important and informative to the 
design of DA models. It is known that, unless all clinically 
important factors have been included, the DA lacks suffi- 
cient representativeness to be clinically useful [7,28-30]. 
Moreover, DA designs need to follow a consistent set 
of best practices for selecting (estimates from SR/MA 
rather than individual studies), adjusting for bias and 
incorporating empirical evidence [3,31-33]. Our findings 
further highlight the need of further investigation of 
the impact of DA design attributes such as use of 
meta-analysis data and data from multiple sources on 
clinical rationality of DA models. Investigation of influence 
of DA design attributes on usefulness of DA model in deci- 
sion making will further improve use of DAs in healthcare 
decision making and policy development. 
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